Enhancement Of A Chinese Discourse Marker Tagger With C4.5

Authors

  • Benjamin K. Tsou
  • Tom Bong-Yeung Lai
  • Samuel W. K. Chan
  • Weijun Gao
  • Xuegang Zhan
Abstract

Discourse markers are complex discontinuous linguistic expressions which are used to explicitly signal the discourse structure of a text. This paper describes efforts to improve an automatic tagging system which identifies and classifies discourse markers in Chinese texts by applying machine learning (ML) to the disambiguation of discourse markers, as an integral part of automatic text summarization via rhetorical structure. Encouraging results are reported.

Similar resources

Fine-Grained Chinese Discourse Relation Labelling

This paper explores several aspects together for a fine-grained Chinese discourse analysis. We deal with the issues of ambiguous discourse markers, ambiguous marker linkings, and more than one discourse marker. A universal feature representation is proposed. The pair-once postulation, cross-discourse-unit-first rule and word-pair-marker-first rule select a set of discourse markers from ambiguou...

Topic Identification In Chinese Based On Centering Model

In this paper we are concerned with identifying the topics of sentences in Chinese texts. The key elements of the centering model of local discourse coherence are employed to identify the topic, which is the most salient element in a Chinese sentence. Because zero anaphora occurs frequently in Chinese texts, in addition to the centering model, we further employ the constraint...

Acquisition of the perfective aspect marker Le of Mandarin Chinese in discourse by American college learners

Sentence Classification Experiments for Legal Text Summarisation

We describe experiments in building a classifier which determines the rhetorical status of sentences. The research is part of a text summarisation project for the legal domain and we use a newly compiled and annotated corpus of judgments of the UK House of Lords. Rhetorical role classification is an initial step which provides input to the sentence selection component of the system. We report r...

Disambiguating potential connectives

Many discourse connectives also have non-discourse, or sentential, readings. Therefore, for automatic discourse structure analysis, there arises a disambiguation problem even before the question of signalled discourse relation becomes relevant. We focus here on a set of nine German connectives and characterize the task of determining their discourse/sentential reading. Starting from an analysis of...

Publication date: 2000